Factored Translation between Brazilian Portuguese and English
Authors
Abstract
Factored translation is an extension of state-of-the-art phrase-based statistical machine translation (PB-SMT). The main difference in the factored approach is that a word is not just a token (its surface form) but a vector of information such as lemma, part of speech, or morphological/syntactic tags. In this paper we present experiments in which factored translation models were trained and tested on Brazilian Portuguese and English texts. With part-of-speech and morphological information, the factored models outperformed the baseline (a PB-SMT system), but the same gain was not achieved when flat syntactic tags were used.
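To make the "word as a vector" idea concrete, here is a minimal sketch of how a factored token might be represented and serialized. It is not the authors' implementation: the class name `FactoredWord`, the helper functions, the choice of four factors, the pipe-delimited layout (one common convention, used for example by the Moses toolkit), and the tag values are all assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FactoredWord:
    """One word as a vector of factors (hypothetical factor set)."""
    surface: str  # the token as it appears in the text
    lemma: str    # dictionary form
    pos: str      # part-of-speech tag
    morph: str    # morphological tag, e.g. gender and number

    def to_factored(self, sep: str = "|") -> str:
        # Join the factors into a single delimited token.
        return sep.join((self.surface, self.lemma, self.pos, self.morph))


def parse_factored(token: str, sep: str = "|") -> FactoredWord:
    """Split a delimited token back into its four factors."""
    parts = token.split(sep)
    if len(parts) != 4:
        raise ValueError(f"expected 4 factors, got {len(parts)}: {token!r}")
    return FactoredWord(*parts)


def encode_sentence(words: list[FactoredWord]) -> str:
    """Serialize a sentence as space-separated factored tokens."""
    return " ".join(w.to_factored() for w in words)


# Illustrative Brazilian Portuguese phrase "as casas" ("the houses").
# The tags below are made up for this example, not taken from the paper.
sentence = [
    FactoredWord("as", "o", "DET", "F.P"),
    FactoredWord("casas", "casa", "N", "F.P"),
]

print(encode_sentence(sentence))
```

A plain PB-SMT baseline would see only the `surface` field; a factored model can additionally condition on the other fields, which is the difference the abstract describes.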
Similar resources
‘Minor’ Languages, ‘Broken’ Translations: On Brazilian Reworkings of an Albanian Novel
This essay approaches the challenges of global translation in the 21st century from what might still be considered a somewhat uncommon example: a direct translation of Ismail Kadaré's 1978 novel Prill e thyër (Broken April) from the original Albanian into Brazilian Portuguese in 2001. Not only does it examine and compare lexical elements in the source and target texts and the usage of translato...
Fine-Tuning in Brazilian Portuguese-English Statistical Transfer Machine Translation: Verbal Tenses
This paper describes an experiment designed to evaluate the development of a Statistical Transfer-based Brazilian Portuguese to English Machine Translation system. We compare the performance of the system with the inclusion of new syntactic written rules concerning verbal tense between the Brazilian Portuguese and English languages. Results indicate that the system performance improved compared...
LIHLA: A lexical aligner based on language-independent heuristics
Alignment of words and multiword units plays an important role in many natural language processing applications, such as example-based machine translation, transfer rule learning for machine translation, bilingual lexicography, word sense disambiguation, etc. In this paper we describe LIHLA, a lexical aligner which uses bilingual probabilistic lexicons generated by a freely available set of too...
Fully Automatic Compilation of Portuguese-English and Portuguese-Spanish Parallel Corpora
This paper reports the fully automatic compilation of parallel corpora for Brazilian Portuguese. Scientific news texts available in Brazilian Portuguese, English and Spanish are automatically crawled from a multilingual Brazilian magazine. The texts are then automatically aligned at document and sentence level. The resulting corpora contain about 2,700 parallel documents totaling over 150,000 al...
Translation and Validation of the Food Neophobia Scale (fns) to the Brazilian Portuguese.
INTRODUCTION: The Food Neophobia Scale (FNS), originally developed in English, has been widely used in different studies to assess the individual's willingness to try new foods. However, a process of translation and cultural adaptation is required to enable the use of FNS in other countries. OBJECTIVE: To translate and to validate the FNS into Brazilian Portuguese. METHODS: The FNS was transla...